Search results for "Source code"
showing 10 items of 61 documents
Online Management of Hybrid DRAM-NVMM Memory for HPC
2019
Non-volatile main memories (NVMMs) offer performance comparable to DRAM, while requiring lower static power consumption and enabling higher densities. NVMM can therefore provide opportunities for improving both the energy efficiency and the cost of main memory. Previous hybrid main memory management approaches for HPC either do not consider the unique characteristics of NVMMs, depend on high profiling costs, or need source code modifications. In this paper, we investigate HPC applications' behaviors in the presence of NVMM as part of the main memory. By performing a comprehensive study of HPC applications and based on several key observations, we propose an online hybrid memory architecture for …
FeatherCNN: Fast Inference Computation with TensorGEMM on ARM Architectures
2020
Deep Learning is ubiquitous in a wide field of applications ranging from research to industry. In comparison to time-consuming iterative training of convolutional neural networks (CNNs), inference is a relatively lightweight operation making it amenable to execution on mobile devices. Nevertheless, lower latency and higher computation efficiency are crucial to allow for complex models and prolonged battery life. Addressing the aforementioned challenges, we propose FeatherCNN – a fast inference library for ARM CPUs – targeting the performance ceiling of mobile devices. FeatherCNN employs three key techniques: 1) A highly efficient TensorGEMM (generalized matrix multiplication) routine is app…
Interoperable real-time symbolic programming for smart environments
2019
Smart environments demand novel paradigms offering easy configuration, programming and deployment of pervasive applications. To this purpose, different solutions have been proposed ranging from visual paradigms based on mashups to formal languages. However, most of the paradigms proposed in the literature require further external tools to turn application description code into an executable program before the deployment on target devices. Source code generation, runtime upgrades and recovery, and online debugging and inspection are often cumbersome in these programming environments. In this work we describe a methodology for real-time and on-line programming in smart environments that is co…
Recentrifuge: Robust comparative analysis and contamination removal for metagenomics
2017
Metagenomic sequencing is becoming widespread in biomedical and environmental research, and the pace is increasing even more thanks to nanopore sequencing. With a rising number of samples and data per sample, the challenge of efficiently comparing results within a specimen and between specimens arises. Reagents, laboratory, and host related contaminants complicate such analysis. Contamination is particularly critical in low microbial biomass body sites and environments, where it can comprise most of a sample if not all. Recentrifuge implements a robust method for the removal of negative-control and crossover taxa from the rest of samples. With Recentrifuge, researchers can analyze results f…
2016
The growth of next-generation sequencing (NGS) datasets poses a challenge to the alignment of reads to reference genomes in terms of alignment quality and execution speed. Some available aligners have been shown to obtain high quality mappings at the expense of long execution times. Finding fast yet accurate software solutions is of high importance to research, since availability and size of NGS datasets continue to increase. In this work we present an efficient parallelization approach for NGS short-read alignment on multi-core clusters. Our approach takes advantage of a distributed shared memory programming model based on the new UPC++ language. Experimental results using the CUSHAW3 alig…
parSRA: A framework for the parallel execution of short read aligners on compute clusters
2018
The growth of next generation sequencing datasets poses a challenge to the alignment of reads to reference genomes in terms of both accuracy and speed. In this work we present parSRA, a parallel framework to accelerate the execution of existing short read aligners on distributed-memory systems. parSRA can be used to parallelize a variety of short read alignment tools installed in the system without any modification to their source code. We show that our framework provides good scalability on a compute cluster for accelerating the popular BWA-MEM and Bowtie2 aligners. On average, it is able to accelerate sequence alignments on 16 64-core nodes (in total, 1024 cores) with speedup of 10.48 …
ParDRe: faster parallel duplicated reads removal tool for sequencing studies
2016
This is a pre-copyedited, author-produced version of an article accepted for publication in Bioinformatics following peer review. The version of record is available online at: https://doi.org/10.1093/bioinformatics/btw038 [Abstract] Summary: Current next generation sequencing technologies often generate duplicated or near-duplicated reads that (depending on the application scenario) do not provide any interesting biological information but can increase memory requirements and computational time of downstream analysis. In this work we present ParDRe, a de novo parallel tool to remove duplicated and near-duplicated reads through the clustering of S…
MSAProbs-MPI: parallel multiple sequence aligner for distributed-memory systems
2016
This is a pre-copyedited, author-produced version of an article accepted for publication in Bioinformatics following peer review. The version of record (Jorge González-Domínguez, Yongchao Liu, Juan Touriño, Bertil Schmidt; MSAProbs-MPI: parallel multiple sequence aligner for distributed-memory systems, Bioinformatics, Volume 32, Issue 24, 15 December 2016, Pages 3826–3828, https://doi.org/10.1093/bioinformatics/btw558) is available online at: https://doi.org/10.1093/bioinformatics/btw558 [Abstract] MSAProbs is a state-of-the-art protein multiple sequence alignment tool based on hidden Markov models. It can achieve high alignment accuracy at the expense of relatively long runtimes for large-sca…
Simulation-based estimation of branching models for LTR retrotransposons
2017
Abstract. Motivation: LTR retrotransposons are mobile elements that are able, like retroviruses, to copy and move inside eukaryotic genomes. In the present work, we propose a branching model for studying the propagation of LTR retrotransposons in these genomes. This model allows us to take into account both the positions and the degradation level of LTR retrotransposon copies. In our model, the duplication rate is also allowed to vary with the degradation level. Results: Various functions have been implemented in order to simulate their spread, and visualization tools are proposed. Based on these simulation tools, we have developed a first method to evaluate the parameters of this propagation …
Multitemporal Mosaicing for Sentinel-3/FLEX Derived Level-2 Product Composites
2020
The increasing availability of remote sensing data raises important challenges in terms of operational data provision and spatial coverage for conducting global studies and analyses. In this regard, existing multitemporal mosaicing techniques are generally limited to producing spectral image composites without considering the particular features of higher-level biophysical and other derived products, such as those provided by the Sentinel-3 (S3) and Fluorescence Explorer (FLEX) tandem missions. To relieve these limitations, this article proposes a novel multitemporal mosaicing algorithm specially designed for operational S3-derived products and also studies its applicability within the FLEX…